ABSTRACT

Assuming a Euclid-like weak lensing data set, we compare different methods of dealing with
its inherent parameter degeneracies. Including priors in a data analysis can mask the
information content of a given data set alone. However, since the information content of a
data set is usually estimated with the Fisher matrix, priors are added in order to enforce an
approximately Gaussian likelihood. Here, we compare priorless forecasts to more conventional
forecasts that use priors. We find strongly non-Gaussian likelihoods for 2d-weak lensing if no
priors are used, which we approximate with the DALI-expansion. Without priors, the Fisher
matrix of the 2d-weak lensing likelihood includes unphysical values of $\Omega_m$ and $h$,
since it does not capture the shape of the likelihood well. The Cramér-Rao inequality then
need not apply. We find that DALI and Monte Carlo Markov Chains predict the presence of
dark energy with high significance, whereas a Fisher forecast of the same data set also allows
decelerated expansion. We also find that a 2d-weak lensing analysis provides a sharp lower
limit on the Hubble constant of $h > 0.4$, even if the equation of state of dark energy is
jointly constrained by the data. This is not predicted by the Fisher matrix and is usually
masked in other works by a sharp prior on $h$. Additionally, we find that DALI estimates
Figures of Merit in the presence of non-Gaussianities better than the Fisher matrix. We
further demonstrate how DALI allows switching to a Hamiltonian Monte Carlo sampling of a
highly curved likelihood with acceptance rates of about 0.5, an effective covering of the
parameter space, and numerically almost costless leapfrog steps. This shows how quick
forecasts can be upgraded to accurate forecasts whenever needed. Results were obtained with
the public DALI code.

1 INTRODUCTION

Weak cosmic lensing is currently a field of intense focus: it allows
the measurement of cosmological parameters, especially
in the late Universe, and is therefore an ideal probe of dark energy
physics or models of modified gravity. After the first significant
detections (Kaiser et al. 2000; Wittman et al. 2000; Bacon et al. 2000;
Van Waerbeke et al. 2000), weak gravitational lensing has been
observed with increasing significance by e.g. CFHTLenS (Kilbinger
et al. 2013; Heymans et al. 2013), allowing the determination of
cosmological parameters.

In the future, weak lensing will be investigated on about a third
of the sky with the upcoming Euclid satellite (Laureijs et al. 2011).
While the Euclid data set is not yet available, its constraining power
on different extensions of the current cosmological standard model
is being forecast, see e.g. Amendola & Tsujikawa (2010). Also,
statistical techniques are being improved and the data analysis is
being refined, for example by switching from a two-dimensional weak
lensing analysis to weak lensing tomography (Hu 1999, 2002) and
3d weak cosmic shear (Heavens 2003; Castro et al. 2005; Heavens
et al. 2006), by including higher-order polyspectra of the weak
lensing shear (Munshi et al. 2010), or by combining lensing with
other tracers of cosmological structure growth. On the way to Euclid
there will be large-scale lensing surveys with an emphasis on
dark energy, for instance the Kilo-Degree Survey (KiDS) (de Jong
et al. 2013) and the Dark Energy Survey (DES) (Melchior et al.
2015).

All these different methods need a tool to assess the
information content of the data set under a specific analysis. Usually,
the wish is to quickly forecast the resulting likelihood constraints or
Figures of Merit, and sometimes also Bayesian evidences. In
principle, Monte Carlo Markov Chains (MCMC), Nested Sampling or
grid-based likelihood evaluations are well-suited tools for these
aims, but they are very time consuming. Quick estimates of the
above quantities are therefore usually made with the Fisher matrix
approach (Tegmark et al. 1997), which hinges on the assumption that
the likelihood is well approximated by a multivariate Gaussian.
However, a Gaussian likelihood can only be obtained under
additional assumptions about the data set and the parametric model that
is fitted to the data:

The Gaussian shape of the likelihood will only be achieved
if the signal depends linearly on the model parameters. If the
data depend non-linearly on the parameters, this non-linearity can
cause the likelihood to be non-Gaussian, unless the data set is so
constraining that a linear Taylor approximation around the
maximum likelihood point is enough to capture the variation of
the physical model within the parameter space that is preferred by
the data. Particularly severe non-Gaussianities can be expected if
non-linear parameters are in addition strongly degenerate with each
other.

Such non-Gaussianities lead to a broad variety of likelihood
contour shapes, amongst which banana shapes are often observed,
as well as straight but asymmetric contours. Such asymmetries can
be introduced, for example, if the best fit point lies close to the
boundary of an unphysical region.

These non-ellipsoidal and non-symmetric likelihood shapes
are not captured by the Fisher matrix analysis. The Fisher
matrix might then result in confidence contours that easily extend
beyond the physically meaningful parameter range. The problem
can worsen if parameters are marginalized over: in cases where
the Fisher matrix captures the orientation of the likelihood wrongly
along one parameter direction, all other parameters will be affected
by this, since they are being constrained jointly. If the Fisher
matrix extends into unphysical regions, the Cramér-Rao bound also
need not be fulfilled, since it holds under the condition that
the Fisher information is defined and finite everywhere within the
covered data and parameter space. We discuss this issue in Sect. 5
for the example of 2d weak lensing.

All these problems are traditionally addressed by combining
likelihoods from different probes, or by imposing priors such
that parameter degeneracies are broken and the combined
likelihood is more sharply peaked and therefore confined to the
physically meaningful parameter space. This solves the above
problems by removing non-Gaussianities. Another solution would be
to accurately capture existing non-Gaussianities. The latter is
possible with the DALI-approach (Sellentin et al. 2014, henceforth
SAQ2014; Sellentin 2015), which we shall compare in the following
to the Fisher matrix approach and to MCMC-evaluated
likelihoods.

We adopt the Einstein sum convention for repeated indices.
Our cosmological parameter set consists of $x_\mu = (\Omega_m, \sigma_8, n_s, h, w)$,
which are the density of cold dark matter today, the
normalization of the power spectrum, the primordial spectral
index, the Hubble constant and a redshift independent dark
energy equation of state parameter. Our fiducial cosmology is
$\Omega_m = 0.25$, $\sigma_8 = 0.8$, $n_s = 0.96$, $h = 0.7$, $w = -0.98$. We keep the
density of baryons fixed to $\Omega_b = 0.04$.

This paper is organized as follows: we describe the modelling
of the weak lensing observations and why degeneracies can
be expected in Sect. 2. The likelihood and its approximations are
described in Sects. 3 and 4. Sect. 5 contains a comparison of the
Fisher matrix, DALI and MCMC, and describes the advantages
of accounting for non-Gaussian degeneracies instead of removing
them by the use of priors. We use DALI as an approximate potential
for a Hamilton Monte Carlo sampler in Sect. 7. Sect. 8 presents a
summary of our results.


2 COSMOLOGY AND WEAK LENSING 

In spatially flat dark energy cosmologies with a redshift independent
equation of state parameter $w$, one obtains for the Hubble function
$H(a) = \mathrm{d}\ln a/\mathrm{d}t$,

$$\frac{H^2(a)}{H_0^2} = \frac{\Omega_m}{a^3} + \frac{1-\Omega_m}{a^{3(1+w)}}. \qquad (1)$$

The comoving distance $\chi$ is related to the scale factor $a$ through

$$\chi = -c\int_1^a \frac{\mathrm{d}a}{a^2 H(a)}, \qquad (2)$$

where the Hubble distance $\chi_H = c/H_0$ is the natural unit for
cosmological distance measures. Small fluctuations $\delta$ in the distribution
of cold dark matter grow in the linear regime $|\delta| \ll 1$ (Linder &
Jenkins 2003) according to

$$\frac{\mathrm{d}^2\delta}{\mathrm{d}a^2} + \frac{1}{a}\left(3 + \frac{\mathrm{d}\ln H}{\mathrm{d}\ln a}\right)\frac{\mathrm{d}\delta}{\mathrm{d}a} - \frac{3\Omega_m(a)}{2a^2}\,\delta = 0, \qquad (3)$$

and their statistics are characterised by the spectrum
$\langle\delta(\mathbf{k})\delta(\mathbf{k}')\rangle = (2\pi)^3\delta_D(\mathbf{k}+\mathbf{k}')P_\delta(k)$
with the ansatz $P_\delta(k) \propto k^{n_s}T^2(k)$ using the
transfer function $T(k)$, while they grow proportionally to the growth
function $D_+(a) = \delta(a)/\delta(1)$. The transfer function depends on the
Hubble constant through the shape parameter $\Gamma = \Omega_m h$ (Bardeen
et al. 1986; Sugiyama 1995). The spectrum is normalised to the
value $\sigma_8$,

$$\sigma_8^2 = \int_0^\infty \frac{k^2\mathrm{d}k}{2\pi^2}\,W^2(8\,\mathrm{Mpc}/h \times k)\,P_\delta(k), \qquad (4)$$

with a Fourier-transformed spherical top-hat $W(x) = 3j_1(x)/x$ as
the filter function, where $j_1(x)$ is the spherical Bessel function of
the first kind. From the CDM spectrum of the density perturbations,
the spectrum of the dimensionless Newtonian gravitational
potential $\Phi$ can be obtained,

$$P_\Phi(k) \propto \left(\frac{3\Omega_m}{2\chi_H^2}\right)^2 k^{n_s-4}\,T(k)^2, \qquad (5)$$

by applying the comoving Poisson equation $\Delta\Phi = 3\Omega_m/(2\chi_H^2)\,\delta$ for
converting between density contrast $\delta$ and gravitational potential $\Phi$.
Additional variance of the cosmic density field on nonlinear scales
is described by Smith et al. (2003), which we include in our
modelling.
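
As a minimal numerical sketch of Eqs. (1) to (3), the following standard-library Python evaluates the Hubble function, the comoving distance in units of $\chi_H$, and the normalised growth function $D_+(a)$. The function names, the Simpson and Runge-Kutta integrators, and the matter-dominated initial condition $\delta = a$ at $a = 10^{-3}$ are illustrative assumptions, not the implementation used for the forecasts.

```python
import math

def hubble_ratio(a, omega_m, w):
    """H(a)/H0 for a flat wCDM cosmology, Eq. (1)."""
    return math.sqrt(omega_m / a**3 + (1.0 - omega_m) / a**(3.0 * (1.0 + w)))

def comoving_distance(a, omega_m, w, n=2000):
    """chi(a) in units of chi_H = c/H0, Eq. (2), composite Simpson rule (n even)."""
    step = (1.0 - a) / n
    def integrand(x):
        return 1.0 / (x * x * hubble_ratio(x, omega_m, w))
    total = integrand(a) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(a + i * step)
    return total * step / 3.0

def _growth_rhs(a, delta, ddelta, omega_m, w):
    """Eq. (3) as a first-order system: returns (d delta/da, d^2 delta/da^2)."""
    e2 = hubble_ratio(a, omega_m, w) ** 2
    dlnh_dlna = -(3.0 * omega_m / a**3
                  + 3.0 * (1.0 + w) * (1.0 - omega_m) / a**(3.0 * (1.0 + w))) / (2.0 * e2)
    omega_m_a = omega_m / (a**3 * e2)
    return ddelta, -(3.0 + dlnh_dlna) * ddelta / a + 1.5 * omega_m_a * delta / a**2

def _integrate_delta(a_end, omega_m, w, a_start=1e-3, n=4000):
    """Classical RK4 from a_start (assumed growing mode delta = a) up to a_end."""
    h = (a_end - a_start) / n
    a, d, v = a_start, a_start, 1.0
    for _ in range(n):
        k1 = _growth_rhs(a, d, v, omega_m, w)
        k2 = _growth_rhs(a + h / 2, d + h / 2 * k1[0], v + h / 2 * k1[1], omega_m, w)
        k3 = _growth_rhs(a + h / 2, d + h / 2 * k2[0], v + h / 2 * k2[1], omega_m, w)
        k4 = _growth_rhs(a + h, d + h * k3[0], v + h * k3[1], omega_m, w)
        d += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        a += h
    return d

def growth_factor(a, omega_m, w):
    """D_+(a) = delta(a)/delta(1)."""
    return _integrate_delta(a, omega_m, w) / _integrate_delta(1.0, omega_m, w)
```

For an Einstein-de Sitter universe ($\Omega_m = 1$) the sketch reproduces the analytic limits $D_+(a) = a$ and $\chi/\chi_H = 2(1-\sqrt{a})$, which is a convenient sanity check before moving to the fiducial cosmology.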

Weak gravitational lensing probes the tidal gravitational fields
of the cosmic large-scale structure by the distortion of light bundles
(Bartelmann & Schneider 2001; Bartelmann 2010). To make best
use of the cosmological information, one divides the galaxy
sample from which shape correlation functions or spectra are estimated
(Takada & White 2004; Amara & Refregier 2007; Huterer & White
2005; Jain & Taylor 2003; Takada & Jain 2004; Schäfer &
Heisenberg 2012) into $n_\mathrm{bin}$ redshift intervals and computes the lensing
potential $\psi_i$ at the position $\boldsymbol{\theta}$ for each redshift bin $i$ separately,

$$\psi_i(\boldsymbol{\theta}) = \int_0^{\chi_H}\mathrm{d}\chi\,W_i(\chi)\,\Phi, \qquad (6)$$

hence $\psi_i$ is related to the gravitational potential $\Phi$ by projection
with the weight function $W_i(\chi)$, which contains physical
information since it depends on the background geometry and also on
the growth of matter as

$$W_i(\chi) = 2\,\frac{D_+(a)}{a}\,\frac{G_i(\chi)}{\chi}. \qquad (7)$$


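Modes $\psi_{\ell m,i}$ of the lensing potential are obtained by the
decomposition $\psi_{\ell m,i} = \int\mathrm{d}\Omega\,\psi_i(\boldsymbol{\theta})\,Y^*_{\ell m}(\boldsymbol{\theta})$ into spherical harmonics
$Y_{\ell m}(\boldsymbol{\theta})$. The distribution $p(z)\mathrm{d}z$ of the lensed galaxies in redshift
is incorporated in the lensing efficiency function $G_i(\chi)$,

$$G_i(\chi) = \int_{\min(\chi,\chi_i)}^{\chi_{i+1}}\mathrm{d}\chi'\,p(\chi')\,\frac{\mathrm{d}z}{\mathrm{d}\chi'}\left(1 - \frac{\chi}{\chi'}\right), \qquad (8)$$

with $\mathrm{d}z/\mathrm{d}\chi' = H(\chi')/c$ and the bin edges $\chi_i$ and $\chi_{i+1}$, respectively.
Euclid forecasts commonly use the parameterisation (Refregier &
the DUNE collaboration 2008)

$$p(z)\mathrm{d}z \propto \left(\frac{z}{z_0}\right)^2\exp\left[-\left(\frac{z}{z_0}\right)^\beta\right]\mathrm{d}z. \qquad (9)$$

Angular spectra of the tomographic weak lensing potential
can be written as the variance
$\langle\psi_{\ell m,i}\psi^*_{\ell' m',j}\rangle = \delta_{\ell\ell'}\delta_{mm'}C_{\psi,ij}(\ell)$, which
we approximate by the corresponding flat-sky expression,

$$C_{\psi,ij}(\ell) = \int_0^{\chi_H}\frac{\mathrm{d}\chi}{\chi^2}\,W_i(\chi)W_j(\chi)\,P_\Phi(k = \ell/\chi). \qquad (10)$$

The convergence $\kappa$ and the shear $\gamma$ follow by double differentiation
of the lensing potential with respect to angles, $\kappa = \Delta\psi/2$, therefore
their spectra are equal to $\ell^4 C_{\psi,ij}(\ell)/4$. Observed spectra of the weak
lensing shear will contain a constant contribution $\sigma_\epsilon^2 n_\mathrm{bin}/\bar{n}$ known
as shape noise, which translates into

$$\hat{C}_{\psi,ij}(\ell) = C_{\psi,ij}(\ell) + \frac{4}{\ell^4}\,\sigma_\epsilon^2\,\frac{n_\mathrm{bin}}{\bar{n}}\,\delta_{ij}, \qquad (11)$$

which will be at the same time the covariance matrix for
measurements of the modes $\psi_{\ell m,i}$. $C_{\psi,ij}(\ell)$ is non-zero for redshift bins $i \neq j$
because light rays share the section between the observer and the
closer tomography bin, such that they contain partially the same
statistical information, leading to a non-vanishing covariance. We
will mainly work with 2d-weak lensing, but for an additional
comparison with 2-bin tomography, we choose bins such that they
contain equal numbers of galaxies.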
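
The choice of equally populated tomography bins can be sketched numerically. The example below assumes the commonly used source distribution $p(z) \propto (z/z_0)^2\exp[-(z/z_0)^\beta]$ with $\beta = 3/2$; the value of $\beta$, the function names and the integration grid are assumptions made for illustration and are not taken from the text. It finds the redshifts that split the galaxy sample into given fractions.

```python
import math

def source_density(z, z0, beta):
    """Unnormalised source distribution p(z) (assumed functional form)."""
    return (z / z0) ** 2 * math.exp(-((z / z0) ** beta))

def quantile_redshifts(z0, fractions, beta=1.5, n=20000):
    """Redshifts below which the given fractions of all galaxies lie.

    The cumulative distribution is built with the trapezoidal rule on
    [0, 10 z0], where the remaining tail of p(z) is negligible.
    """
    dz = 10.0 * z0 / n
    cdf = [0.0]
    for i in range(1, n + 1):
        cdf.append(cdf[-1] + 0.5 * dz * (source_density((i - 1) * dz, z0, beta)
                                         + source_density(i * dz, z0, beta)))
    total = cdf[-1]
    edges = []
    for frac in fractions:
        target = frac * total
        i = next(j for j, c in enumerate(cdf) if c >= target)
        # linear interpolation inside the grid cell that crosses the target
        t = (target - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
        edges.append((i - 1 + t) * dz)
    return edges
```

For two bins, the single inner bin edge is the median of the distribution. Under the assumed $\beta = 3/2$ the median lies at about $1.41\,z_0$, so a median redshift of 0.9 would correspond to `quantile_redshifts(0.9 / 1.412, [0.5])`.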


2.1 Curved degeneracy lines in weak lensing 

Weak gravitational lensing derives its sensitivity to cosmological
parameters from a combination of the amplitudes of gravitational
potentials and geometry. Directly, it depends on $(\Omega_m\sigma_8)^2$ because
this sets the strength of the gravitational potential. Therefore, the
hyperbolic degeneracy line between these two parameters will
follow $\Omega_m \propto 1/\sigma_8$. The gravitational potentials grow with $D_+(a)/a$,
which depends also on the dark energy equation of state parameter
$w$. The dark energy equation of state also influences the conversion
between distance measures and redshift because the expansion of
the Universe is modified. More negative $w$ makes comoving
distances for a given redshift larger, thus increasing the lensing
signal. The shape of the spectrum $P(k)$ is determined by $n_s$, $h$ and
$\Omega_m$, which is reflected in the spectra $C_{\psi,ij}(\ell)$ more weakly due to
the weighting functions $W_i(\chi)$. The shape parameter of the power
spectrum is $\Gamma = \Omega_m h$, which introduces a degeneracy between
$\Omega_m$ and $h$ that will again be a hyperbolic line.


3 THE UNAPPROXIMATED LIKELIHOOD 

For statistically homogeneous random fields, weak gravitational
lensing yields independent modes in multipole $\ell$ and, in the case
of statistical isotropy, a measurement of $2\ell+1$ independent modes
for each multipole. The likelihood for a model $C_\psi(\ell)$ to be able to
reproduce the set $\{\psi_{\ell m,i}\}$ of observed modes separates in the ideal
case in $\ell$ and $m$ according to

$$\mathcal{L}\left(\{\psi_{\ell m,i}\}\right) = \prod_\ell p\left(\psi_{\ell m,i}|C_\psi(\ell)\right)^{2\ell+1}, \qquad (12)$$

where the equal likelihoods of all modes $m$ at fixed $\ell$ have been
multiplied.

Observational issues like an incomplete sky coverage and a
point spread function lead to a coupling of different $\ell$. Although
of course relevant for analyses of real data sets, this is often
omitted for forecasting (Hannestad et al. 2006) because non-diagonal
or non-block diagonal covariance matrices are much harder to
invert. For a Monte Carlo sampler, this inversion is necessary for each
sample, and we therefore assume that the different $\ell$-modes
decouple for the sake of speed.

The likelihood for each observed mode $\psi_{\ell m,i}$ if the theory
predicts a covariance $C_\psi(\ell)$ is Gaussian in the data,

$$p\left(\psi_{\ell m,i}|C_\psi(\ell)\right) = \frac{1}{\sqrt{(2\pi)^{n_\mathrm{bin}}\det C_\psi(\ell)}}\exp\left(-\frac{1}{2}\psi_{\ell m,i}\left(C_\psi^{-1}(\ell)\right)_{ij}\psi^*_{\ell m,j}\right), \qquad (13)$$

due to the fact that both the cosmic structures and the noise are
approximately Gaussian random fields. Consequently, the
logarithmic likelihood $L = -\ln\mathcal{L}$ is up to an additive constant equal to

$$L = \sum_\ell\frac{2\ell+1}{2}\left(\mathrm{tr}\ln C_\psi + \left(C_\psi^{-1}\right)_{ij}\psi_{\ell m,i}\psi^*_{\ell m,j}\right), \qquad (14)$$

by using the relation $\ln\det C = \mathrm{tr}\ln C$ for the matrix $C_{\psi,ij}$
indexed by the tomography bin numbers. We will often refer to
Eq. (14) as the true likelihood, since no approximations were used so far
apart from physical approximations such as the flat sky approximation and
integrating the lensing signal along a straight line,
and the assumption that the projected lensing potential has nearly
Gaussian fluctuation statistics.

We model Euclid's weak lensing survey (Laureijs et al. 2011)
to reach out to a median redshift of 0.9 and to yield $\bar{n} = 4.8\times10^8$
galaxies per steradian. We assume that the shape measurement
produces a Gaussian noise with standard deviation $\sigma_\epsilon = 0.3$. We
use a sky fraction of $f_\mathrm{sky} = 0.35$ and a multipole range of 30 to
3000.

Figure 1. The marginalized Fisher matrix (grey) in the $\Omega_m,h$-plane. The
blue rectangle indicates the area bounded by the constraints $h > 0$, $\Omega_m > 0$,
which might be interpreted as minimal priors that could be applied to foster
the constraining power of the weak lensing data set.

Figure 2. Comparison of the different likelihood approximations with the MCMC-sampled likelihood for 2d-weak lensing. The contours enclose 68%, 95%
and 99% of the likelihood. Solid grey: Fisher approximation combined with the additional constraints $\Omega_m > 0$ and $h > 0$ (implemented by sampling from
the Fisher matrix and discarding all unphysical samples before marginalizing). DALI with second-order derivatives of the covariance matrix is shown in open
green contours, DALI with second and third derivatives is shown in open blue contours. The dots in the bottom right panel are samples drawn from the
Fisher matrix approximated likelihood and are predicted to be points of high likelihood by the Fisher matrix. However, when calculating the unapproximated
likelihood of these samples, they turn out to be extremely unlikely parameter combinations. This demonstrates that the sharp cutoff towards lower $h$ in the
bottom right panel is correct. In the top left and bottom right panel, the likelihood asymptotes roughly towards $h \approx 0.4$, and the cutoff in the $\sigma_8,h$-plane is just
a different projection of this behaviour. The purple line in the bottom left panel indicates the constraint $w < -1/3$ for accelerated expansion, and it can be seen
that over $\sim 90\%$ of the MCMC and DALI contours fall within the parameter space of accelerated expansion, thereby strongly indicating the presence of
dark energy, whereas $\sim 30\%$ of the Fisher matrix covers parameter regions that would not lead to accelerated expansion.


4 THE DIFFERENT LIKELIHOOD APPROXIMATIONS

The Fisher matrix approach and DALI use derivatives at the best fit
point to approximate Eq. (14) with increasing precision: from the
data-averaged curvature of the logarithmic likelihood one
derives the Fisher matrix (Tegmark et al. 1997; Heavens 2003),

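$$F_{\mu\nu} = \sum_\ell\frac{2\ell+1}{2}\,\mathrm{tr}\left(C_\psi^{-1}\,\partial_\mu C_\psi\,C_\psi^{-1}\,\partial_\nu C_\psi\right), \qquad (15)$$

with $\partial_\mu$ being the derivatives with respect to individual
cosmological parameters, $x_\mu = (\Omega_m, \sigma_8, n_s, h, w)$. The Fisher matrix $F_{\mu\nu}$
allows the construction of a Gaussian likelihood of inferred
parameters,

$$\mathcal{L} \propto \exp\left(-\frac{1}{2}\Delta x_\mu F_{\mu\nu}\Delta x_\nu\right), \qquad (16)$$

with the distances $\Delta x_\mu$ of the parameters from the best fit point.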
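
To illustrate how a Fisher matrix of this form is assembled for 2d lensing, where the covariance at each multipole is a single number, the following sketch evaluates $F_{\mu\nu} = \sum_\ell \frac{2\ell+1}{2}\,\partial_\mu C_\ell\,\partial_\nu C_\ell / C_\ell^2$ with central finite differences. The toy spectrum $C_\ell = A^2\ell^{-n}$ is a hypothetical stand-in for the lensing spectrum, and all function names and step sizes are assumptions.

```python
def toy_spectrum(ell, params):
    """Hypothetical toy spectrum C_ell = A^2 * ell^(-n); not the lensing model."""
    amp, tilt = params
    return amp ** 2 * ell ** (-tilt)

def fisher_matrix(spectrum, fiducial, ell_min=30, ell_max=3000, rel_step=1e-4):
    """Fisher matrix for a zero-mean Gaussian field with scalar covariance C_ell."""
    n_par = len(fiducial)
    fisher = [[0.0] * n_par for _ in range(n_par)]
    for ell in range(ell_min, ell_max + 1):
        c0 = spectrum(ell, fiducial)
        grads = []
        for mu in range(n_par):
            h = rel_step * max(abs(fiducial[mu]), 1.0)
            up = list(fiducial)
            dn = list(fiducial)
            up[mu] += h
            dn[mu] -= h
            # central finite difference for the derivative of C_ell
            grads.append((spectrum(ell, up) - spectrum(ell, dn)) / (2.0 * h))
        for mu in range(n_par):
            for nu in range(n_par):
                fisher[mu][nu] += 0.5 * (2 * ell + 1) * grads[mu] * grads[nu] / c0 ** 2
    return fisher
```

For this toy model the amplitude entry can be checked by hand: $\partial_A \ln C_\ell = 2/A$, so $F_{AA} = (2/A^2)\sum_\ell(2\ell+1)$ over the chosen multipole range.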

DALI then introduces higher order derivatives of the
likelihood in order to recover non-Gaussianities. To fourth order in $\Delta x_\mu$
the DALI-approximated likelihood reads

$$\begin{aligned}
\ln p(x_\mu) = &-\frac{1}{4}\sum_\ell(2\ell+1)\,\mathrm{tr}\left(C^{-1}\partial_\mu C\,C^{-1}\partial_\nu C\right)\Delta x_\mu\Delta x_\nu \\
&-\frac{1}{4}\sum_\ell(2\ell+1)\,\mathrm{tr}\left(C^{-1}\partial_\alpha\partial_\beta C\,C^{-1}\partial_\gamma C\right)\Delta x_\alpha\Delta x_\beta\Delta x_\gamma \\
&-\frac{1}{16}\sum_\ell(2\ell+1)\,\mathrm{tr}\left(C^{-1}\partial_\alpha\partial_\beta C\,C^{-1}\partial_\gamma\partial_\delta C\right)\Delta x_\alpha\Delta x_\beta\Delta x_\gamma\Delta x_\delta \\
&+\mathcal{O}(\Delta x^5),
\end{aligned} \qquad (17)$$

see Sellentin (2015). Here, the index $\psi$ of the covariance matrix has
been suppressed for brevity. From the second line of Eq. (17)
onwards, the higher order derivatives of the covariance matrix give
access to the non-linearity of the parameters. DALI can in principle
also include derivatives of arbitrary order. For
example, a first estimate of non-Gaussianities present in the model can
be achieved by including second derivatives with DALI; a refined
estimate can be achieved by also including third order derivatives.
In order to keep DALI fast, stopping at third order derivatives is
advised but not mandatory.
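
The structure of the second-derivative DALI expansion can be made concrete with a small sketch for a scalar covariance. It uses a hypothetical two-parameter toy spectrum $C_\ell = (x_0 x_1)^2/\ell^2$, chosen only because it mimics the hyperbolic amplitude degeneracy of Sect. 2.1, and compares the Fisher term, the DALI expansion up to second derivatives, and the data-averaged version of Eq. (14). Names, step sizes and the toy model are assumptions.

```python
import math

def toy_spectrum(ell, x):
    """Hypothetical amplitude-only toy model: C_ell = (x0 * x1)^2 / ell^2."""
    return (x[0] * x[1]) ** 2 / ell ** 2

def derivatives(f, x0, h=1e-3):
    """Central finite-difference value, gradient and Hessian of f at x0."""
    n = len(x0)
    def shifted(i, si, j=None, sj=0.0):
        x = list(x0)
        x[i] += si * h
        if j is not None:
            x[j] += sj * h
        return f(x)
    f0 = f(list(x0))
    grad = [(shifted(i, 1) - shifted(i, -1)) / (2 * h) for i in range(n)]
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        hess[i][i] = (shifted(i, 1) - 2 * f0 + shifted(i, -1)) / h ** 2
        for j in range(i + 1, n):
            mixed = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                     - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h ** 2)
            hess[i][j] = hess[j][i] = mixed
    return f0, grad, hess

def log_likes(dx, x0, ells):
    """Return (Fisher, DALI, exact data-averaged) log-likelihoods at x0 + dx."""
    fisher = dali = exact = 0.0
    n = len(x0)
    x = [a + b for a, b in zip(x0, dx)]
    for ell in ells:
        c0, grad, hess = derivatives(lambda p: toy_spectrum(ell, p), x0)
        lin = sum(grad[i] * dx[i] for i in range(n)) / c0
        quad = sum(hess[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n)) / c0
        weight = 2 * ell + 1
        fisher -= 0.25 * weight * lin ** 2
        # the three lines of the DALI expansion: quadratic, cubic and quartic in dx
        dali -= 0.25 * weight * (lin ** 2 + lin * quad + 0.25 * quad ** 2)
        ratio = toy_spectrum(ell, x) / c0
        exact -= 0.5 * weight * (math.log(ratio) + 1.0 / ratio - 1.0)
    return fisher, dali, exact
```

The three DALI terms combine into a perfect square, $-\frac{1}{4}\sum_\ell(2\ell+1)(\delta C/C)^2$ with $\delta C$ the second-order Taylor expansion of the covariance, so the approximation can never become positive. In this toy model, a point on the exact degeneracy line $x_0 x_1 = \mathrm{const}$, for instance `log_likes([1.0, -0.5], [1.0, 1.0], range(30, 40))`, is penalised far less by DALI than by the Fisher term alone, while the data-averaged likelihood is unchanged there.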


5 FORECASTING WITH NON-GAUSSIANITIES 

In the following, we compare which information about the
preferred parameter space can be extracted from the weak lensing
data set from Sect. 2 when analysed with the Fisher matrix, DALI
and MCMC. Additionally, we compare the Figures of Merit (FoM)
from the different approximations: the Fisher matrix allows a
convenient definition of a FoM via the determinant of $2\times2$ submatrices,

$$\mathrm{FoM} = \sqrt{\det F_{2\times2}}. \qquad (18)$$

The area enclosed by a chosen confidence contour in a given parameter
plane is inversely proportional to this quantity, so that the enclosed
area can equally serve as a FoM. We generalize this
concept to our non-Gaussian forecasts by defining that the FoM
shall be the area enclosed by the 95%-confidence contour.
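
Both versions of the FoM can be sketched in a few lines. For a Gaussian, the area inside the contour enclosing a probability `level` is $\pi\,\Delta\chi^2/\sqrt{\det F_{2\times2}}$ with $\Delta\chi^2 = -2\ln(1-\mathrm{level})$. For samples from a non-Gaussian likelihood, one simple estimator, chosen here for illustration and not necessarily the one used for Fig. 3, bins the samples on a grid and counts the densest cells until they hold the requested fraction of samples.

```python
import math
from collections import Counter

def fisher_fom(f2):
    """Eq. (18): square root of the determinant of a 2x2 Fisher matrix."""
    return math.sqrt(f2[0][0] * f2[1][1] - f2[0][1] * f2[1][0])

def gaussian_contour_area(f2, level=0.95):
    """Area enclosed by the `level` confidence ellipse of a 2d Gaussian."""
    delta_chi2 = -2.0 * math.log(1.0 - level)
    return math.pi * delta_chi2 / fisher_fom(f2)

def sample_contour_area(samples, cell, level=0.95):
    """Area of the densest grid cells that together contain `level` of the samples."""
    counts = Counter((math.floor(x / cell), math.floor(y / cell)) for x, y in samples)
    needed = level * len(samples)
    covered = 0
    n_cells = 0
    for count in sorted(counts.values(), reverse=True):
        if covered >= needed:
            break
        covered += count
        n_cells += 1
    return n_cells * cell * cell
```

The grid estimator makes no assumption about the contour shape, so it applies equally to banana-shaped likelihoods. Its accuracy depends on the cell size and the number of samples, which have to be tuned for each parameter plane.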

We begin by evaluating the Fisher matrix for this setup. Fig. 1
shows the marginalized Fisher matrix approximated likelihood in
the $\Omega_m,h$-plane. Clearly, the Fisher matrix reaches far into
unphysical regions of negative $\Omega_m$. It also covers regions of negative
Hubble constants. Sensitivity with respect to the Hubble constant enters
weak lensing through the shape parameter $\Gamma = \Omega_m h$ of the power
spectrum. However, the shape parameter is a length scale and must
therefore be positive definite. Negative $h$ and $\Omega_m$ are therefore
nonsensical in the chosen parameterization of the power spectrum via
a shape parameter, and these negative values must be excluded.

This shows that the Fisher matrix cannot be used for a 2d
weak lensing analysis for the Euclid satellite without enforcing by
priors that the shape parameter has to be positive definite. In the
appendix we briefly discuss why the Cramér-Rao inequality does
not hold if unphysical parameter ranges are covered by the Fisher
matrix.

For the comparison of the Fisher matrix with DALI and
MCMC, we therefore augment the Fisher matrix with the prior
knowledge $\Omega_m > 0$, $h > 0$. In practice, we implement this by
drawing samples from the Fisher matrix approximated likelihood, and
discarding all samples that fall into the unphysical regions. The
introduction of these sharp cutoffs in $\Omega_m$ and $h$ leads to a
non-Gaussian likelihood approximation. This approximation is depicted
in grey in Fig. 2 and was also used for comparing FoMs in Fig. 3.
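
The truncation described above amounts to rejection sampling from a Gaussian. A minimal two-parameter sketch, assuming a $2\times2$ Fisher matrix in the $(\Omega_m, h)$-plane and hypothetical function and argument names, could look as follows.

```python
import math
import random

def sample_truncated_fisher(fisher, fiducial, n, rng, lower=(0.0, 0.0)):
    """Draw n samples from the 2d Gaussian defined by `fisher`, keeping only
    those with both parameters above the bounds in `lower`."""
    (a, b), (_, d) = fisher
    det = a * d - b * b
    # parameter covariance is the inverse of the Fisher matrix
    cov = [[d / det, -b / det], [-b / det, a / det]]
    # Cholesky factor of the covariance, used to correlate unit normal draws
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    kept = []
    while len(kept) < n:
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = fiducial[0] + l11 * u
        y = fiducial[1] + l21 * u + l22 * v
        if x > lower[0] and y > lower[1]:
            kept.append((x, y))  # physical sample; all others are discarded
    return kept
```

The Fisher matrix entries passed to such a function must come from the actual forecast; any numbers used to try it out are placeholders. Because the cut is applied to the joint samples before marginalizing, it also reshapes the marginal distributions of all other parameters that are correlated with $\Omega_m$ and $h$.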

From a Fisher matrix analysis one would conclude that a 2d
weak lensing analysis of a Euclid-like survey does not allow one to place a
lower bound on the Hubble constant of our Universe. Additionally,
the large uncertainty in $h$ and $\Omega_m$ leads to rather loose constraints
on the remaining parameters $\sigma_8$, $n_s$ and $w$.

A comparison with MCMC-sampled likelihoods shows that
the data are actually more constraining than predicted by the Fisher
matrix, and a DALI-evaluation of the likelihood contours reveals
that the problem is entirely due to non-Gaussianities and
degeneracies between non-linear parameters.

Figure 3. Figure of merit from the different approximations, relative to
the MCMC figure of merit. The non-Gaussian DALI-approximations
always perform better than the Fisher matrix, although no clear trend can
be made out. However, the DALI-FoMs differ by at most $\sim 30\%$ from
the MCMC-FoM, whereas the Fisher-FoM differs about twice as much,
namely by up to $\sim 65\%$.

In Fig. 2, a comparison between the Fisher matrix, DALI and
MCMC-samples of the likelihood is shown. For MCMC and DALI,
no prior constraints like $\Omega_m > 0$ were used. Highly curved
degeneracy lines and asymmetric likelihood shapes are evident. These
curved degeneracy lines are well approximated by DALI, although
not perfectly. As the likelihood asymptotes to $h \approx 0.4$ in the $\Omega_m,h$-plane
and in the $h,w$-plane, negative $h$ are excluded without the
use of any priors. This shows that the 2d weak lensing analysis is
able to predict a lower bound of $h > 0.5$ on its own. Also due to
the highly curved likelihood shapes, $\Omega_m$ does not become negative
but stays confined to the physical region. These strong changes in
the allowed range of $\Omega_m$ and $h$ in comparison to the Fisher
matrix propagate into the constraints on the remaining parameters
$\sigma_8$, $n_s$, $w$. For dark energy, the curved DALI-approximation predicts
$0.3 > w > -2.0$. In contrast, the Fisher matrix allows much smaller
and even positive $w$. This is interesting for the forecasting of dark
energy constraints: an accelerated expansion of the universe
requires $w < -1/3$. About one third of the Fisher matrix, however, covers
the parameter space $w > -1/3$, and only two thirds fall into
the parameter range of accelerated expansion. In contrast, DALI
and MCMC both favour accelerated expansion to a much larger
degree: about 90% of their preferred parameter range corresponds
to an accelerating universe. Note that the fact that the Fisher
matrix also covers parameter regions of decelerated expansion stems
from it being by construction symmetric around the best fit point.
Also in SAQ2014, we observed that this high symmetry leads to the
Fisher matrix covering parameter ranges of decelerated expansion,
whereas the real likelihood did not.
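
The effect of this symmetry can be illustrated with a short calculation. Assuming, for illustration only, that the marginal Fisher likelihood of $w$ is exactly Gaussian and centred on the fiducial $w = -0.98$, the fraction of probability in the decelerating region $w > -1/3$ depends on the marginal width alone. The width itself is not quoted in the text, so the value inferred below is an assumption-based estimate, not a result of the forecast.

```python
from statistics import NormalDist

W_FIDUCIAL = -0.98          # fiducial equation of state
W_ACCELERATION = -1.0 / 3.0  # expansion accelerates for w below this value

def decelerating_fraction(sigma_w):
    """Probability mass of a Gaussian marginal in w that lies above -1/3."""
    return 1.0 - NormalDist(W_FIDUCIAL, sigma_w).cdf(W_ACCELERATION)

def implied_sigma(fraction):
    """Marginal width for which `fraction` of the Gaussian lies above -1/3."""
    z = NormalDist().inv_cdf(1.0 - fraction)
    return (W_ACCELERATION - W_FIDUCIAL) / z
```

Under these assumptions, a decelerating fraction of about one third corresponds to a marginal width of roughly 1.5 in $w$. A symmetric Gaussian centred on the fiducial point can put at most half of its weight on the decelerating side, and approaches that limit as the width grows, whereas an asymmetric likelihood is not bound to spread its weight evenly.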

The FoMs in the different parameter planes are compared 
in Fig. 3, demonstrating that the DALI-FoMs are closer to the 
MCMC-FoM than the Fisher-FoM. 

In summary, the width of the Fisher ellipse perpendicular to
the main degeneracy directions is in good agreement with the width
of the true likelihood, whereas the semi-major axes parallel to the
degeneracy lines are overestimated. DALI instead estimates
the constraining power of the data well also along directions of strong
degeneracies. This comparison shows that the correct modelling of
degeneracies between non-linear parameters can remove the
necessity to break degeneracies by imposing priors in order to establish a
Gaussian likelihood. The major advantage of modelling parameter
degeneracies instead of breaking them with priors is that it allows
one to focus on a single data set, and to optimize its scientific
return independently of external measurements. As a comparison, we
shall in the following nonetheless show whether non-Gaussianities
remain if a prior on $h$ is introduced or 2-bin tomography is
performed.


6 BREAKING DEGENERACIES 

Generally speaking, two ways of breaking degeneracies in a data set 
exist: Either, the data set is combined with external measurements, 
or the analysis of the data set is substantially improved. For weak 
lensing, the first option could be for example implemented by com¬ 
bining the data set with a prior on the Hubble constant. The second 
option could be implemented by changing from a 2d-analysis to 
tomographic weak lensing, since tomographic weak lensing is able 
to break degeneracies between cosmic parameters by the additional 
redshift information. 

As a complement to the parameter constraints in Sect. 5, we
investigate which non-Gaussianities are still present if we combine
the 2d-weak lensing survey with a Gaussian prior on $h$ with
standard deviation $\sigma_h = 0.03$, roughly corresponding to the precision of
current local constraints on the Hubble constant (Riess et al. 2011).
Fig. 4 shows that even when using this prior, the posterior
likelihood is not peaked sharply enough, and non-Gaussianities remain.
It also shows that switching to 2-bin tomography outperforms the
inclusion of this prior into a 2d weak lensing analysis.

Figure A1 shows a comparison of the Fisher- and DALI-forecasts
and MCMC for 2-bin tomography. The likelihood is then
well approximated by a multivariate Gaussian. If one were to
include more model parameters, e.g. a redshift dependent equation
of state for dark energy, the likelihood would widen again,
potentially leading to a non-Gaussian shape.


7 USING DALI AS APPROXIMATE POTENTIAL FOR 
HAMILTON MONTE CARLO 

In order to have a comparison for DALI and the Fisher matrix, we 
generated MCMC-samples of the unapproximated likelihood. 

The Metropolis-Hastings algorithm (Metropolis 1985) works
well for approximately multivariate Gaussian likelihoods but has
problems with following highly curved likelihoods. We therefore
employed a Hamilton-Monte-Carlo (HMC) sampler, which uses
Hamiltonian dynamics for describing a random walk on a
potential $P$ corresponding to the logarithmic likelihood,

$$P(x_\mu) = -\ln(\mathcal{L}), \qquad (19)$$

and a kinetic energy to introduce the randomness needed for
sampling.

The algorithm takes multiple leap frog steps along contours of
approximately constant likelihood before performing a Metropolis-Hastings
step by which it decides whether the new point is
accepted or rejected (Hajian 2007). For each leap frog step, the
HMC-sampler takes derivatives of the logarithmic likelihood and follows
these, thereby adjusting well to curved likelihoods.

Calculating derivatives of the true log-likelihood can be
numerically costly. A gain in performance can then be achieved if
the log-likelihood is replaced by an approximation which is fast
to evaluate, such as DALI. Consequently, we do not use the
log-likelihood of Eq. (14) for the sampler, but the DALI-approximation
Eq. (17) for the leap frog steps along the potential. Calculating the
true weak-lensing likelihood is then only needed in the Metropolis-Hastings
steps. This procedure speeds up the performance of our
sampler by a factor ranging between 30 and 100, depending on
how many leap frog steps were done in each iteration of the
HMC-algorithm.

Figure 6. Samples from a Hamilton Monte Carlo chain (thinned for plotting):
for the approximate potential, the tempered DALI-likelihood
Eq. (20) was used. Rejected samples are depicted in red, accepted
samples are depicted in blue. Clearly visible is a red rim where accepted and
rejected samples do not mix, demonstrating that the sampler was able to
reach all points in parameter space that are preferred by the data. In the
other two-dimensional planes, there also exists a ring of rejected samples.

A potential issue with using DALI-contours to guide an HMC
sampler is that DALI might exclude regions of the parameter space
that are actually preferred by the true likelihood. In order to avoid
this problem, we introduce a temperature $T$ to widen the potential
Eq. (19),

$$\ln P(x_\mu) \to \ln P(x_\mu)/T. \qquad (20)$$

If the temperature is set too high, the contours of the potential
Eq. (20) will not generate samples that follow the true likelihood
well. This leads to a reduction of the acceptance rate. We find that
$T = 3$ leads to an acceptance rate between 0.3 and 0.5 while still
giving the sampler the possibility to reach all regions in parameter
space that are erroneously not covered by the DALI-approximation.
In Fig. 6 we plot samples of such an MCMC-chain, demonstrating
that the sampler has indeed been able to cover the true likelihood
fully: the accepted samples are surrounded by a rim of rejected
samples. This rim shows that the sampler had the chance to explore
regions of parameter space with low likelihood.

In contrast, using the Fisher matrix as an approximate
potential for the HMC-sampler has proven ineffective: since it does not
capture the curvature of the likelihood, the sampler is frequently
guided towards regions of extremely low likelihood if it follows the
isocontours of the Fisher approximation. Consequently, even after
adjusting the number of leap frog steps, no acceptance rate higher
than 0.02 could be achieved in our application, while many regions of
the preferred parameter space were not sampled (in an acceptable
time) at all.
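
The sampling scheme of this section can be condensed into a one-dimensional sketch: the leap frog steps follow the gradient of a cheap, tempered surrogate potential, and the true potential is evaluated only once per iteration, in the Metropolis-Hastings step. The function names, the unit-mass kinetic energy, and the step settings below are illustrative assumptions; in the application described above, the surrogate would be the DALI approximation and the true potential Eq. (14).

```python
import math
import random

def hmc_with_surrogate(true_potential, surrogate_grad, q0, n_samples, rng,
                       step=0.2, n_leapfrog=10, temperature=3.0):
    """1d HMC whose trajectories use the tempered surrogate gradient only."""
    q = q0
    u_q = true_potential(q)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)          # fresh momentum, unit mass
        q_new, p_new = q, p
        # leap frog integration on the surrogate potential divided by T
        p_new -= 0.5 * step * surrogate_grad(q_new) / temperature
        for i in range(n_leapfrog):
            q_new += step * p_new
            if i < n_leapfrog - 1:
                p_new -= step * surrogate_grad(q_new) / temperature
        p_new -= 0.5 * step * surrogate_grad(q_new) / temperature
        # Metropolis-Hastings step with the true potential (one evaluation)
        u_new = true_potential(q_new)
        log_ratio = (u_q + 0.5 * p * p) - (u_new + 0.5 * p_new * p_new)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            q, u_q = q_new, u_new
            accepted += 1
        samples.append(q)
    return samples, accepted / n_samples
```

Since the leap frog map is reversible and volume preserving for any potential, accepting or rejecting with the true Hamiltonian keeps the true likelihood as the stationary distribution, however crude the surrogate is. A poor surrogate costs acceptance rate rather than correctness, in line with the behaviour described above for the Fisher matrix.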


8 SUMMARY 

In this paper, we have investigated different methods of treating
parameter degeneracies when estimating the information content of
a Euclid-like weak lensing data set by calculating parameter
constraints. The three possible ways are: (i) break the degeneracies
by using priors, (ii) include the parameter degeneracies by using
a non-Gaussian likelihood and (iii) break the degeneracies by
substantially changing the data analysis, in this case changing from
2d-weak lensing to lensing tomography.

Figure 4. A comparison of 2d weak lensing and 2-bin tomography. Yellow: DALI-forecasted constraints for the 2d weak lensing analysis combined with a
Gaussian prior on $h$ from local measurements. Since DALI with second order derivatives and DALI with second and third order derivatives agree very well,
only the latter are shown. Blue: MCMC-sampled likelihood for a 2-bin tomography analysis of the same data set without using a prior on $h$.

Figure 5. Two-bin tomography for the same weak lensing survey is able to break the degeneracies inherent in the parameter set, such that the overall constraints
become much tighter and the Fisher matrix (grey contours) agrees well with the MCMC-contours (solid blue). The MCMC-samples were generated with the
Metropolis-Hastings algorithm. DALI finds nearly the same confidence contours as the Fisher matrix since only minor non-Gaussianity is present.

Additionally, we have demonstrated how the DALI-approximation
can be used to guide a Hamilton Monte Carlo sampler,
such that highly curved likelihoods can be sampled effectively.
For 2d weak lensing we found strong non-Gaussianities in the
likelihoods and:

• The Fisher matrix then extends far into the unphysical parameter 
range Ω_m < 0 and h < 0, since the Gaussian approximation to 
the actual likelihood is not particularly good. 

• Introducing the priors h > 0 and Ω_m > 0 makes the 
Fisher-matrix-approximated likelihood similar to the MCMC-likelihood. 
The Cramer-Rao inequality does not need to be fulfilled when 
unphysical parameter regions are covered by the Fisher matrix. 

• The Fisher matrix predicts that 2d-weak lensing at Euclid 
precision will not be able to put a lower limit on the Hubble 
constant, whereas we find with MCMC-evaluations and DALI- 
approximations of the likelihood that h > 0.4. 

• The Fisher matrix fails because of strong hyperbolic 
parameter degeneracies. Breaking these degeneracies by including 
priors masks the fact that the actual weak lensing likelihood is able 
to measure the cosmological parameters, including h, without the aid 
of external data sets. 

• We did not require the inclusion of any priors for DALI: the 
fact that it captures non-Gaussianities was sufficient for DALI to 
agree with the MCMC-sampled likelihood. 

• Due to its inherent high symmetry, the Fisher matrix does not 
hint at the presence of dark energy as strongly as an 
MCMC-evaluation or a DALI-approximation does: about a third of the 
Fisher-approximated likelihood falls into parameter ranges of 
decelerated expansion, whereas about 90% of the DALI- and 
MCMC-likelihood fall into the region of accelerated expansion. 

• In the presence of strong non-Gaussianities, the Figures of Merit 
(FoM) from the non-Gaussian DALI approximation agree better with 
the FoM from MCMC than the FoM from the Fisher matrix do: the 
DALI-FoM was at most ~ 30% larger than the MCMC-FoM, whereas the 
Fisher-FoM was up to 65% too large. 

• Using DALI as an approximate potential for a Hamiltonian 
Monte Carlo sampler speeds up the sampling in two ways. 
Firstly, all leapfrog steps become effectively numerically costless, 
since evaluating the DALI likelihood is extremely fast. Secondly, 
as the DALI-contours already follow the isocontours of the real 
likelihood very well, the sampler is guided towards relevant 
regions in parameter space. This increases the acceptance ratio. In 
our application, we were able to increase the acceptance rate from 
0.02 to 0.3 - 0.5, meaning the sampler needs to try at least 15 times 
fewer samples in order to achieve the same number of accepted 
samples. Simultaneously, the evaluation time for each sample was on 
average cut down by a factor of about 80, since the leapfrog steps 
no longer required the calculation of the real likelihood. 
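
The last item can be illustrated with a minimal Python sketch. It is not the DALI code: `u_true` is a hypothetical stand-in for the expensive negative log-likelihood, `grad_u_surrogate` is a hypothetical cheap approximation playing the role that DALI plays above, and the banana-shaped toy potential, the step size, and the number of leapfrog steps are assumptions chosen purely for illustration. The leapfrog trajectory only ever calls the surrogate gradient, while the accept/reject test evaluates the true potential once per proposal.

```python
import numpy as np


def u_true(q):
    """Hypothetical expensive negative log-likelihood (curved, banana-shaped)."""
    return 0.5 * q[0] ** 2 + 0.5 * (q[1] - q[0] ** 2) ** 2


def grad_u_surrogate(q, b=0.9):
    """Gradient of a cheap surrogate 0.5*q0^2 + 0.5*(q1 - b*q0^2)^2.

    The surrogate deliberately differs from u_true (b != 1), mimicking an
    approximation that follows the true isocontours well but not perfectly.
    """
    r = q[1] - b * q[0] ** 2
    return np.array([q[0] - 2.0 * b * q[0] * r, r])


def hmc_with_surrogate(n_samples, eps=0.1, n_leapfrog=20, seed=0):
    """HMC whose leapfrog steps use only the surrogate gradient."""
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    samples = np.empty((n_samples, 2))
    n_accepted = 0
    for i in range(n_samples):
        p0 = rng.standard_normal(2)      # fresh momenta, unit mass matrix
        q_new, p = q.copy(), p0.copy()
        # Leapfrog integration: no call to the true likelihood in here.
        p -= 0.5 * eps * grad_u_surrogate(q_new)
        for step in range(n_leapfrog):
            q_new += eps * p
            if step < n_leapfrog - 1:
                p -= eps * grad_u_surrogate(q_new)
        p -= 0.5 * eps * grad_u_surrogate(q_new)
        # Metropolis test with the true potential, evaluated at the end point.
        h_old = u_true(q) + 0.5 * p0 @ p0
        h_new = u_true(q_new) + 0.5 * p @ p
        if np.log(rng.uniform()) < h_old - h_new:
            q = q_new
            n_accepted += 1
        samples[i] = q
    return samples, n_accepted / n_samples


samples, acceptance = hmc_with_surrogate(2000)
```

Because the leapfrog map remains reversible and volume-preserving for any potential, the accept/reject step with the true potential keeps the sampled distribution correct; a poorer surrogate only lowers the acceptance rate.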

Even after introducing a prior from local measurements of 
the Hubble constant into the 2d weak lensing analysis, 
non-Gaussianities remain in the combined likelihood. However, these 
are much less pronounced. We also found that tomographic weak 
lensing without a prior on h leads to tighter parameter constraints 
than 2d weak lensing with a prior on h. For our five-dimensional 
parameter set Ω_m, σ_8, n_s, h, w, the likelihood for 2-bin tomography 
was already close to Gaussian. Potentially, this might change if a 
redshift dependence of the dark energy equation of state is allowed. 

Figure A1. Replacement of Fig. 2c and Fig. 3c in Sellentin et al. (2014), 
using the new public version of DALI, in which third derivatives are 
calculated more accurately. The DALI-likelihood contours now follow the 
true likelihood better, demonstrating that the mismatch between the true 
likelihood (grey) and the DALI-contours observed in Sellentin et al. (2014) 
was only due to numerically crude estimates. 


APPENDIX A: IMPROVED THIRD DERIVATIVES 

It is already known from the Fisher matrix approach that numerical 
derivatives must be estimated accurately, since they determine 
the extent and the orientation of the Fisher matrix. This issue also 
affects DALI, since it also uses numerical derivatives. In SAQ2014, 
DALI was applied to a data set of supernovae. There, the 
DALI-contours of Figs. 2c and 3c had been observed to leak out of the 
true likelihood shape. These plots had been generated with an old 
version of the DALI code, which used a numerically fast but rough 
algorithm for calculating third derivatives: it took another 
derivative of precomputed and splined second derivatives. The new 
public version of DALI uses a slower but more accurate routine for 
calculating third derivatives: it calculates them by using finite 
differences on the original function, not on any already derived 
quantities. Redoing the analysis of SAQ2014 with the improved code 
results in Fig. A1, demonstrating that with the more carefully 
conducted estimate of third derivatives, the erroneous leakage of the 
DALI-likelihoods disappears. 
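
As an illustration of taking finite differences directly on the original function, the following Python sketch uses the standard central five-point stencil for a third derivative. The function name `third_derivative`, the step size, and the choice of stencil are assumptions made for this example; the text above does not state which stencil the DALI code uses.

```python
import math


def third_derivative(f, x, h=1e-2):
    """Central finite-difference estimate of f'''(x), accurate to O(h^2).

    Only evaluations of the original function f enter; no precomputed or
    splined lower-order derivative is differentiated again.
    """
    return (f(x + 2 * h) - 2 * f(x + h) + 2 * f(x - h) - f(x - 2 * h)) / (2 * h ** 3)


# Check against a function with a known third derivative:
# d^3/dx^3 sin(x) = -cos(x).
estimate = third_derivative(math.sin, 0.3)
exact = -math.cos(0.3)
```

The step size trades truncation error, which grows as h^2, against round-off error, which grows as 1/h^3, so in practice it would have to be tuned to the function at hand.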


APPENDIX B: THE CRAMER-RAO INEQUALITY AND 
UNPHYSICAL PARAMETER RANGES 

The Cramer-Rao inequality does not need to apply if the Fisher 
matrix covers unphysical parameter ranges. To illustrate this, we 
consider a distribution function f(d, θ), where d is the data set and, 
for simplicity, only one parameter θ is to be estimated (otherwise, 
one would simply need to marginalize over the other parameters). 

The Cramer-Rao inequality actually holds for the Fisher 
information I, which is the averaged squared gradient of the 
log-likelihood, 

I = \int f(d, \theta) \, [\partial_\theta \log f(d, \theta)]^2 \, \mathrm{d}d. (B1) 

In contrast, the Fisher matrix is the averaged curvature of the 
negative log-likelihood, 

F_{\theta\theta} = -\int f(d, \theta) \, \partial_\theta \partial_\theta \log f(d, \theta) \, \mathrm{d}d. (B2) 

Explicitly calculating the second derivatives in Eq. B2 shows 
that the Fisher matrix and the Fisher information are related by 

F_{\theta\theta} = I - \int \partial_\theta \partial_\theta f(d, \theta) \, \mathrm{d}d. (B3) 

In order for the Fisher matrix to be identical to the Fisher 
information, the second term must vanish, which will be the case if 
the differentiation with respect to the parameters and the averaging 
over the data can be interchanged. This, however, requires that the 
distribution f(d, θ) and its derivatives exist and are finite for all 
combinations of the data and the parameters. In the case of the Fisher 
matrix from Sect. 5, this is not fulfilled, and the Cramer-Rao 
inequality then does not need to apply. 
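
The step from Eq. B2 to Eq. B3, and the condition under which the extra term vanishes, can be spelled out as follows. This is a sketch of the standard argument in the notation used above, assuming f(d, θ) > 0 wherever it is integrated and writing f for f(d, θ).

```latex
\begin{align}
\partial_\theta \partial_\theta \log f
  &= \frac{\partial_\theta \partial_\theta f}{f}
   - \left( \frac{\partial_\theta f}{f} \right)^2
   = \frac{\partial_\theta \partial_\theta f}{f}
   - \left( \partial_\theta \log f \right)^2 , \\
F_{\theta\theta}
  &= -\int f \, \partial_\theta \partial_\theta \log f \, \mathrm{d}d
   = \int f \left( \partial_\theta \log f \right)^2 \mathrm{d}d
   - \int \partial_\theta \partial_\theta f \, \mathrm{d}d
   = I - \int \partial_\theta \partial_\theta f \, \mathrm{d}d , \\
\int \partial_\theta \partial_\theta f \, \mathrm{d}d
  &= \partial_\theta \partial_\theta \int f \, \mathrm{d}d
   = \partial_\theta \partial_\theta \, 1 = 0
   \quad \text{(only if differentiation and integration interchange).}
\end{align}
```

The last line uses only the normalization of f over the data; it is the interchange in its first equality that requires the regularity conditions stated above.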